Overview

Dataset statistics

Number of variables: 26
Number of observations: 383245
Missing cells: 475363
Missing cells (%): 4.8%
Duplicate rows: 22637
Duplicate rows (%): 5.9%
Total size in memory: 76.0 MiB
Average record size in memory: 208.0 B
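
The percentages above follow directly from the raw counts. As a quick sanity check, here is a small sketch in plain Python; the variable names are illustrative and not part of the report. It recomputes the missing-cell share, the duplicate-row share, and the total memory implied by the average record size.

```python
# Counts copied from the dataset statistics above.
n_variables = 26
n_observations = 383245
missing_cells = 475363
duplicate_rows = 22637
avg_record_bytes = 208.0

# Share of missing cells among all cells (rows x columns).
missing_pct = 100 * missing_cells / (n_observations * n_variables)

# Share of rows that duplicate another row.
duplicate_pct = 100 * duplicate_rows / n_observations

# Total memory implied by the average record size, in MiB (1 MiB = 2**20 bytes).
total_mib = avg_record_bytes * n_observations / 2**20

print(f"{missing_pct:.1f}% missing, {duplicate_pct:.1f}% duplicates, {total_mib:.1f} MiB")
```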

Variable types

CAT: 16
NUM: 10

Reproduction

Analysis started: 2021-11-08 10:27:55.336769
Analysis finished: 2021-11-08 10:30:20.251279
Duration: 2 minutes and 24.91 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

RECORD DATE has constant value "11/07/2021" Constant
Dataset has 22637 (5.9%) duplicate rows Duplicates
DBA has a high cardinality: 22984 distinct values High cardinality
BUILDING has a high cardinality: 7681 distinct values High cardinality
STREET has a high cardinality: 2473 distinct values High cardinality
PHONE has a high cardinality: 27299 distinct values High cardinality
CUISINE DESCRIPTION has a high cardinality: 86 distinct values High cardinality
INSPECTION DATE has a high cardinality: 1509 distinct values High cardinality
VIOLATION CODE has a high cardinality: 105 distinct values High cardinality
VIOLATION DESCRIPTION has a high cardinality: 106 distinct values High cardinality
GRADE DATE has a high cardinality: 1340 distinct values High cardinality
NTA has a high cardinality: 193 distinct values High cardinality
Longitude is highly correlated with Latitude High correlation
Latitude is highly correlated with Longitude High correlation
BIN is highly correlated with Community Board and 1 other fields High correlation
Community Board is highly correlated with BIN and 1 other fields High correlation
BBL is highly correlated with Community Board and 1 other fields High correlation
ZIPCODE has 5606 (1.5%) missing values Missing
CUISINE DESCRIPTION has 4415 (1.2%) missing values Missing
ACTION has 4414 (1.2%) missing values Missing
VIOLATION CODE has 9069 (2.4%) missing values Missing
VIOLATION DESCRIPTION has 6633 (1.7%) missing values Missing
SCORE has 17949 (4.7%) missing values Missing
GRADE has 189746 (49.5%) missing values Missing
GRADE DATE has 194484 (50.7%) missing values Missing
INSPECTION TYPE has 4414 (1.2%) missing values Missing
Community Board has 6621 (1.7%) missing values Missing
Council District has 6621 (1.7%) missing values Missing
Census Tract has 6621 (1.7%) missing values Missing
BIN has 8384 (2.2%) missing values Missing
NTA has 6621 (1.7%) missing values Missing
Latitude has 5581 (1.5%) zeros Zeros
Longitude has 5581 (1.5%) zeros Zeros
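
The high-cardinality warnings above are consistent with a simple cutoff on the distinct count: CUISINE DESCRIPTION (86 distinct values) is flagged, while INSPECTION TYPE (31) is not. The sketch below illustrates that rule. The threshold of 50 is an assumption chosen to fit these two observations; the value configured in config.yaml is not shown in this report, and the function name is hypothetical.

```python
# Distinct counts for a few categorical variables, copied from this report.
distinct_counts = {
    "BORO": 6,
    "ACTION": 5,
    "GRADE": 7,
    "INSPECTION TYPE": 31,
    "CUISINE DESCRIPTION": 86,
    "VIOLATION CODE": 105,
    "NTA": 193,
}

# ASSUMPTION: the real threshold lives in config.yaml, which is not shown here.
# Any value from 31 up to 85 would reproduce the warnings listed above.
CARDINALITY_THRESHOLD = 50


def high_cardinality(counts, threshold=CARDINALITY_THRESHOLD):
    """Return the variable names whose distinct count exceeds the threshold."""
    return sorted(name for name, n in counts.items() if n > threshold)


flagged = high_cardinality(distinct_counts)
print(flagged)
```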

Variables

CAMIS
Real number (ℝ≥0)

Distinct count: 29824
Unique (%): 7.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 46443726.55225248
Minimum: 30075445
Maximum: 50116984
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 30075445
5-th percentile: 40567158
Q1: 41442734
Median: 50013536
Q3: 50064308
95-th percentile: 50095010
Maximum: 50116984
Range: 20041539
Interquartile range (IQR): 8621574

Descriptive statistics

Standard deviation: 4356588.875
Coefficient of variation (CV): 0.09380360273
Kurtosis: -1.820673534
Mean: 46443726.55
Median Absolute Deviation (MAD): 72922
Skewness: -0.3882643003
Sum: 1.779932598e+13
Variance: 1.897986662e+13
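
Several of the CAMIS statistics above are derived from the others. The following sketch, with illustrative variable names, recomputes the range, the interquartile range, the coefficient of variation, and the variance from the reported minimum, maximum, quartiles, mean, and standard deviation. The same relations hold for every numeric variable in this report.

```python
# Values copied from the CAMIS statistics above.
minimum, maximum = 30075445, 50116984
q1, q3 = 41442734, 50064308
mean = 46443726.55225248
std = 4356588.875

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1
cv = std / mean                   # CV = standard deviation / mean
variance = std ** 2               # Variance = standard deviation squared

print(value_range, iqr, round(cv, 10), f"{variance:.9e}")
```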
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
50035784 | 104 | < 0.1%
40400811 | 99 | < 0.1%
41187577 | 87 | < 0.1%
41683816 | 87 | < 0.1%
41409441 | 85 | < 0.1%
50001789 | 84 | < 0.1%
41425803 | 83 | < 0.1%
41474882 | 83 | < 0.1%
50064654 | 82 | < 0.1%
50012113 | 82 | < 0.1%
Other values (29814) | 382369 | 99.8%

Value | Count | Frequency (%)
30075445 | 12 | < 0.1%
30112340 | 12 | < 0.1%
30191841 | 10 | < 0.1%
40356018 | 11 | < 0.1%
40356483 | 9 | < 0.1%

Value | Count | Frequency (%)
50116984 | 1 | < 0.1%
50116983 | 1 | < 0.1%
50116981 | 1 | < 0.1%
50116980 | 1 | < 0.1%
50116979 | 1 | < 0.1%

DBA
Categorical

HIGH CARDINALITY

Distinct count: 22984
Unique (%): 6.0%
Missing: 1169
Missing (%): 0.3%
Memory size: 2.9 MiB

Value | Count | Frequency (%)
DUNKIN | 4068 | 1.1%
SUBWAY | 2664 | 0.7%
STARBUCKS | 1990 | 0.5%
MCDONALD'S | 1876 | 0.5%
KENNEDY FRIED CHICKEN | 1209 | 0.3%
CROWN FRIED CHICKEN | 1046 | 0.3%
BURGER KING | 1019 | 0.3%
POPEYES | 862 | 0.2%
GOLDEN KRUST CARIBBEAN BAKERY & GRILL | 844 | 0.2%
DUNKIN',' BASKIN ROBBINS | 653 | 0.2%
Other values (22974) | 365845 | 95.5%
(Missing) | 1169 | 0.3%

Length

Max length: 95
Median length: 15
Mean length: 16.36179728
Min length: 2

BORO
Categorical

Distinct count: 6
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.9 MiB

Value | Count | Frequency (%)
Manhattan | 149713 | 39.1%
Brooklyn | 96998 | 25.3%
Queens | 88594 | 23.1%
Bronx | 35584 | 9.3%
Staten Island | 12238 | 3.2%
0 | 118 | < 0.1%

Length

Max length: 13
Median length: 8
Mean length: 7.807269501
Min length: 1

BUILDING
Categorical

HIGH CARDINALITY

Distinct count: 7681
Unique (%): 2.0%
Missing: 737
Missing (%): 0.2%
Memory size: 2.9 MiB

Value | Count | Frequency (%)
1 | 2255 | 0.6%
200 | 1266 | 0.3%
2 | 1218 | 0.3%
0 | 1110 | 0.3%
10 | 1049 | 0.3%
55 | 1017 | 0.3%
25 | 937 | 0.2%
11 | 888 | 0.2%
75 | 881 | 0.2%
60 | 873 | 0.2%
Other values (7671) | 371014 | 96.8%

Length

Max length: 10
Median length: 3
Mean length: 3.454229018
Min length: 1

STREET
Categorical

HIGH CARDINALITY

Distinct count: 2473
Unique (%): 0.6%
Missing: 25
Missing (%): < 0.1%
Memory size: 2.9 MiB

Value | Count | Frequency (%)
BROADWAY | 13811 | 3.6%
3 AVENUE | 8349 | 2.2%
2 AVENUE | 6655 | 1.7%
5 AVENUE | 6601 | 1.7%
8 AVENUE | 5538 | 1.4%
1 AVENUE | 5096 | 1.3%
7 AVENUE | 4581 | 1.2%
AMSTERDAM AVENUE | 4352 | 1.1%
LEXINGTON AVENUE | 4292 | 1.1%
9 AVENUE | 4173 | 1.1%
Other values (2463) | 319772 | 83.4%

Length

Max length: 40
Median length: 13
Mean length: 13.0911323
Min length: 3

ZIPCODE
Real number (ℝ≥0)

MISSING

Distinct count: 229
Unique (%): 0.1%
Missing: 5606
Missing (%): 1.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 10682.982525639565
Minimum: 10000.0
Maximum: 30339.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 10000
5-th percentile: 10003
Q1: 10022
Median: 10469
Q3: 11229
95-th percentile: 11417
Maximum: 30339
Range: 20339
Interquartile range (IQR): 1207

Descriptive statistics

Standard deviation: 602.0356606
Coefficient of variation (CV): 0.05635464245
Kurtosis: 19.34759036
Mean: 10682.98253
Median Absolute Deviation (MAD): 468
Skewness: 0.6366501274
Sum: 4034310838
Variance: 362446.9366
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
10003 | 9541 | 2.5%
10019 | 9178 | 2.4%
10036 | 8220 | 2.1%
10002 | 8048 | 2.1%
10013 | 7940 | 2.1%
10001 | 6821 | 1.8%
11220 | 6711 | 1.8%
10016 | 6547 | 1.7%
11354 | 6492 | 1.7%
10022 | 6492 | 1.7%
Other values (219) | 301649 | 78.7%

Value | Count | Frequency (%)
10000 | 30 | < 0.1%
10001 | 6821 | 1.8%
10002 | 8048 | 2.1%
10003 | 9541 | 2.5%
10004 | 1931 | 0.5%

Value | Count | Frequency (%)
30339 | 7 | < 0.1%
20147 | 1 | < 0.1%
12345 | 11 | < 0.1%
11697 | 51 | < 0.1%
11694 | 375 | 0.1%

PHONE
Categorical

HIGH CARDINALITY

Distinct count: 27299
Unique (%): 7.1%
Missing: 28
Missing (%): < 0.1%
Memory size: 2.9 MiB

Value | Count | Frequency (%)
7185958100 | 297 | 0.1%
__________ | 175 | < 0.1%
2126159700 | 168 | < 0.1%
7182246030 | 160 | < 0.1%
9176186310 | 155 | < 0.1%
2124656273 | 155 | < 0.1%
9172843260 | 150 | < 0.1%
9175665727 | 142 | < 0.1%
7182153308 | 138 | < 0.1%
6463218563 | 135 | < 0.1%
Other values (27289) | 381542 | 99.6%

Length

Max length: 12
Median length: 10
Mean length: 9.99959295
Min length: 3

CUISINE DESCRIPTION
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 86
Unique (%): < 0.1%
Missing: 4415
Missing (%): 1.2%
Memory size: 2.9 MiB

Value | Count | Frequency (%)
American | 73129 | 19.1%
Chinese | 39789 | 10.4%
Pizza | 23452 | 6.1%
Coffee/Tea | 18882 | 4.9%
Latin American | 17121 | 4.5%
Italian | 15634 | 4.1%
Mexican | 14126 | 3.7%
Japanese | 14043 | 3.7%
Caribbean | 13920 | 3.6%
Bakery Products/Desserts | 11737 | 3.1%
Other values (76) | 136997 | 35.7%

Length

Max length: 30
Median length: 8
Mean length: 9.226194732
Min length: 3

INSPECTION DATE
Categorical

HIGH CARDINALITY

Distinct count: 1509
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.9 MiB

Value | Count | Frequency (%)
01/01/1900 | 4414 | 1.2%
11/07/2019 | 846 | 0.2%
03/28/2019 | 833 | 0.2%
10/30/2019 | 821 | 0.2%
10/23/2019 | 802 | 0.2%
11/13/2019 | 799 | 0.2%
03/03/2020 | 790 | 0.2%
09/11/2019 | 789 | 0.2%
01/14/2020 | 785 | 0.2%
01/22/2020 | 779 | 0.2%
Other values (1499) | 371587 | 97.0%

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

ACTION
Categorical

MISSING

Distinct count: 5
Unique (%): < 0.1%
Missing: 4414
Missing (%): 1.2%
Memory size: 2.9 MiB

Value | Count | Frequency (%)
Violations were cited in the following area(s). | 354948 | 92.6%
Establishment Closed by DOHMH. Violations were cited in the following area(s) and those requiring immediate action were addressed. | 15024 | 3.9%
No violations were recorded at the time of this inspection. | 4552 | 1.2%
Establishment re-opened by DOHMH. | 4241 | 1.1%
Establishment re-closed by DOHMH. | 66 | < 0.1%
(Missing) | 4414 | 1.2%

Length

Max length: 130
Median length: 47
Mean length: 49.73220003
Min length: 3

VIOLATION CODE
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 105
Unique (%): < 0.1%
Missing: 9069
Missing (%): 2.4%
Memory size: 2.9 MiB

Value | Count | Frequency (%)
10F | 65389 | 17.1%
08A | 40111 | 10.5%
04L | 26426 | 6.9%
06D | 24480 | 6.4%
06C | 23881 | 6.2%
10B | 22593 | 5.9%
02G | 22360 | 5.8%
04N | 19537 | 5.1%
02B | 18164 | 4.7%
04M | 8007 | 2.1%
Other values (95) | 103228 | 26.9%
(Missing) | 9069 | 2.4%

Length

Max length: 4
Median length: 3
Mean length: 3.004634111
Min length: 3

VIOLATION DESCRIPTION
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 106
Unique (%): < 0.1%
Missing: 6633
Missing (%): 1.7%
Memory size: 2.9 MiB

Value | Count | Frequency (%)
Non-food contact surface improperly constructed. Unacceptable material used. Non-food contact surface or equipment improperly maintained and/or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit. | 65510 | 17.1%
Facility not vermin proof. Harborage or conditions conducive to attracting vermin to the premises and/or allowing vermin to exist. | 40190 | 10.5%
Evidence of mice or live mice present in facility's food and/or non-food areas. | 26476 | 6.9%
Food contact surface not properly washed, rinsed and sanitized after each use and following any activity when contamination may have occurred. | 24561 | 6.4%
Food not protected from potential source of contamination during storage, preparation, transportation, display or service. | 24025 | 6.3%
Plumbing not properly installed or maintained; anti-siphonage or backflow prevention device not provided where required; equipment or floor not properly drained; sewage disposal system in disrepair or not functioning properly. | 22630 | 5.9%
Cold food item held above 41º F (smoked fish and reduced oxygen packaged foods above 38 ºF) except during necessary preparation. | 22475 | 5.9%
Filth flies or food/refuse/sewage-associated (FRSA) flies present in facility’s food and/or non-food areas. Filth flies include house flies, little house flies, blow flies, bottle flies and flesh flies. Food/refuse/sewage-associated flies include fruit flies, drain flies and Phorid flies. | 19564 | 5.1%
Hot food item not held at or above 140º F. | 18252 | 4.8%
Live roaches present in facility's food and/or non-food areas. | 8013 | 2.1%
Other values (96) | 104916 | 27.4%

Length

Max length: 360
Median length: 130
Mean length: 150.2514084
Min length: 3

CRITICAL FLAG
Categorical

Distinct count: 3
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.9 MiB

Value | Count | Frequency (%)
Critical | 201359 | 52.5%
Not Critical | 175253 | 45.7%
Not Applicable | 6633 | 1.7%

Length

Max length: 14
Median length: 8
Mean length: 9.932993255
Min length: 8

SCORE
Real number (ℝ≥0)

MISSING

Distinct count: 136
Unique (%): < 0.1%
Missing: 17949
Missing (%): 4.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.411723643292014
Minimum: 0.0
Maximum: 164.0
Zeros: 2395
Zeros (%): 0.6%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 11
Median: 15
Q3: 26
95-th percentile: 50
Maximum: 164
Range: 164
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 14.98849076
Coefficient of variation (CV): 0.734307941
Kurtosis: 6.51255858
Mean: 20.41172364
Median Absolute Deviation (MAD): 6
Skewness: 2.028946687
Sum: 7456321
Variance: 224.6548553
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
12 | 38698 | 10.1%
13 | 32111 | 8.4%
10 | 21153 | 5.5%
11 | 19997 | 5.2%
9 | 18131 | 4.7%
7 | 13200 | 3.4%
19 | 9908 | 2.6%
20 | 9804 | 2.6%
21 | 9293 | 2.4%
22 | 9222 | 2.4%
Other values (126) | 183779 | 48.0%
(Missing) | 17949 | 4.7%

Value | Count | Frequency (%)
0 | 2395 | 0.6%
2 | 4389 | 1.1%
3 | 2992 | 0.8%
4 | 4600 | 1.2%
5 | 6518 | 1.7%

Value | Count | Frequency (%)
164 | 11 | < 0.1%
157 | 14 | < 0.1%
153 | 11 | < 0.1%
151 | 10 | < 0.1%
150 | 9 | < 0.1%

GRADE
Categorical

MISSING

Distinct count: 7
Unique (%): < 0.1%
Missing: 189746
Missing (%): 49.5%
Memory size: 2.9 MiB

Value | Count | Frequency (%)
A | 152215 | 39.7%
B | 23596 | 6.2%
C | 9242 | 2.4%
N | 4745 | 1.2%
P | 2493 | 0.7%
Z | 1207 | 0.3%
G | 1 | < 0.1%
(Missing) | 189746 | 49.5%

Length

Max length: 3
Median length: 1
Mean length: 1.990207309
Min length: 1

GRADE DATE
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 1340
Unique (%): 0.7%
Missing: 194484
Missing (%): 50.7%
Memory size: 2.9 MiB

Value | Count | Frequency (%)
06/05/2019 | 496 | 0.1%
06/13/2019 | 479 | 0.1%
06/11/2019 | 474 | 0.1%
05/09/2019 | 469 | 0.1%
06/12/2019 | 460 | 0.1%
05/30/2019 | 451 | 0.1%
08/01/2019 | 445 | 0.1%
11/07/2019 | 440 | 0.1%
06/04/2019 | 429 | 0.1%
05/01/2019 | 429 | 0.1%
Other values (1330) | 184189 | 48.1%
(Missing) | 194484 | 50.7%

Length

Max length: 10
Median length: 3
Mean length: 6.447734478
Min length: 3

RECORD DATE
Categorical

CONSTANT
REJECTED

Distinct count: 1
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.9 MiB

Value | Count | Frequency (%)
11/07/2021 | 383245 | 100.0%

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

INSPECTION TYPE
Categorical

MISSING

Distinct count: 31
Unique (%): < 0.1%
Missing: 4414
Missing (%): 1.2%
Memory size: 2.9 MiB

Value | Count | Frequency (%)
Cycle Inspection / Initial Inspection | 223096 | 58.2%
Cycle Inspection / Re-inspection | 86950 | 22.7%
Pre-permit (Operational) / Initial Inspection | 31196 | 8.1%
Pre-permit (Operational) / Re-inspection | 12135 | 3.2%
Administrative Miscellaneous / Initial Inspection | 6835 | 1.8%
Cycle Inspection / Reopening Inspection | 4147 | 1.1%
Pre-permit (Non-operational) / Initial Inspection | 3334 | 0.9%
Smoke-Free Air Act / Initial Inspection | 1756 | 0.5%
Administrative Miscellaneous / Re-inspection | 1668 | 0.4%
Trans Fat / Initial Inspection | 1294 | 0.3%
Other values (21) | 6420 | 1.7%
(Missing) | 4414 | 1.2%

Length

Max length: 59
Median length: 37
Mean length: 36.66562904
Min length: 3

Latitude
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct count: 23574
Unique (%): 6.2%
Missing: 383
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.135671666060155
Minimum: 0.0
Maximum: 40.912822326386
Zeros: 5581
Zeros (%): 1.5%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 40.60039481
Q1: 40.6864742
Median: 40.73281684
Q3: 40.76202067
95-th percentile: 40.85213353
Maximum: 40.91282233
Range: 40.91282233
Interquartile range (IQR): 0.0755464773

Descriptive statistics

Standard deviation: 4.881977637
Coefficient of variation (CV): 0.121636874
Kurtosis: 63.59138685
Mean: 40.13567167
Median Absolute Deviation (MAD): 0.03424014662
Skewness: -8.098016739
Sum: 15366423.53
Variance: 23.83370565
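
The 5581 zeros pull the Latitude mean (40.1357) well below the median (40.7328) and inflate the standard deviation and kurtosis. The sketch below estimates the mean with the zeros excluded, using only figures from this report. Treating the zeros as placeholders for unknown coordinates is an assumption; the report itself only flags them.

```python
# Figures copied from the Overview and the Latitude statistics above.
total_rows = 383245
missing = 383
zeros = 5581
value_sum = 15366423.53

# Number of non-missing latitude values.
n_present = total_rows - missing

# Mean as reported (zeros included).
mean_with_zeros = value_sum / n_present

# Zeros add nothing to the sum, so excluding them only shrinks the denominator.
# ASSUMPTION: a latitude of exactly 0 is a placeholder, not a real location.
mean_without_zeros = value_sum / (n_present - zeros)

print(f"{mean_with_zeros:.4f} -> {mean_without_zeros:.4f}")
```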
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
0 | 5581 | 1.5%
40.64831283 | 1068 | 0.3%
40.75977791 | 478 | 0.1%
40.73384018 | 437 | 0.1%
40.758502 | 356 | 0.1%
40.74186904 | 356 | 0.1%
40.58229742 | 352 | 0.1%
40.77441403 | 334 | 0.1%
40.86590541 | 283 | 0.1%
40.75209393 | 265 | 0.1%
Other values (23564) | 373352 | 97.4%
(Missing) | 383 | 0.1%
ValueCountFrequency (%) 
055811.5%
 
40.499562711< 0.1%
 
40.5080685210< 0.1%
 
40.5091146517< 0.1%
 
40.509135821< 0.1%
 
ValueCountFrequency (%) 
40.9128223315< 0.1%
 
40.910558297< 0.1%
 
40.9104733616< 0.1%
 
40.910011881< 0.1%
 
40.909823011< 0.1%
 

Longitude
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct count: 23574
Unique (%): 6.2%
Missing: 383
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: -72.86426071780869
Minimum: -74.249101331725
Maximum: 0.0
Zeros: 5581
Zeros (%): 1.5%
Memory size: 2.9 MiB

Quantile statistics

Minimum: -74.24910133
5-th percentile: -74.01347541
Q1: -73.9888196
Median: -73.95731359
Q3: -73.89735664
95-th percentile: -73.79009475
Maximum: 0
Range: 74.24910133
Interquartile range (IQR): 0.09146296512

Descriptive statistics

Standard deviation: 8.86244733
Coefficient of variation (CV): -0.1216295512
Kurtosis: 63.60741425
Mean: -72.86426072
Median Absolute Deviation (MAD): 0.03786817541
Skewness: 8.099523347
Sum: -27896956.59
Variance: 78.54297268
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
0 | 5581 | 1.5%
-73.7882815 | 1068 | 0.3%
-73.82923543 | 478 | 0.1%
-73.87157703 | 437 | 0.1%
-74.00471301 | 356 | 0.1%
-73.83324181 | 356 | 0.1%
-74.16905259 | 352 | 0.1%
-73.87729335 | 334 | 0.1%
-73.83042975 | 283 | 0.1%
-73.97760435 | 265 | 0.1%
Other values (23564) | 373352 | 97.4%
(Missing) | 383 | 0.1%
ValueCountFrequency (%) 
-74.249101331< 0.1%
 
-74.248707926< 0.1%
 
-74.248502151< 0.1%
 
-74.248434474< 0.1%
 
-74.2483721815< 0.1%
 
ValueCountFrequency (%) 
055811.5%
 
-73.7009280616< 0.1%
 
-73.7017118724< 0.1%
 
-73.7026668521< 0.1%
 
-73.702681322< 0.1%
 

Community Board
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct count: 69
Unique (%): < 0.1%
Missing: 6621
Missing (%): 1.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 248.75987456986277
Minimum: 101.0
Maximum: 595.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 101
5-th percentile: 102
Q1: 105
Median: 301
Q3: 401
95-th percentile: 412
Maximum: 595
Range: 494
Interquartile range (IQR): 296

Descriptive statistics

Standard deviation: 130.1658143
Coefficient of variation (CV): 0.5232588839
Kurtosis: -1.432718466
Mean: 248.7598746
Median Absolute Deviation (MAD): 106
Skewness: 0.1538379583
Sum: 93688939
Variance: 16943.13922
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
105 | 30810 | 8.0%
103 | 20326 | 5.3%
102 | 18043 | 4.7%
104 | 15371 | 4.0%
407 | 13529 | 3.5%
106 | 12622 | 3.3%
301 | 12123 | 3.2%
108 | 11650 | 3.0%
401 | 11323 | 3.0%
101 | 10962 | 2.9%
Other values (59) | 219865 | 57.4%

Value | Count | Frequency (%)
101 | 10962 | 2.9%
102 | 18043 | 4.7%
103 | 20326 | 5.3%
104 | 15371 | 4.0%
105 | 30810 | 8.0%

Value | Count | Frequency (%)
595 | 11 | < 0.1%
503 | 2916 | 0.8%
502 | 3708 | 1.0%
501 | 5403 | 1.4%
483 | 660 | 0.2%

Council District
Real number (ℝ≥0)

MISSING

Distinct count: 51
Unique (%): < 0.1%
Missing: 6621
Missing (%): 1.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.000985067335062
Minimum: 1.0
Maximum: 51.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 20
Q3: 34
95-th percentile: 47
Maximum: 51
Range: 50
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 15.70688938
Coefficient of variation (CV): 0.7853057898
Kurtosis: -1.282555491
Mean: 20.00098507
Median Absolute Deviation (MAD): 16
Skewness: 0.3075293446
Sum: 7532851
Variance: 246.7063738
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
3 | 33078 | 8.6%
1 | 29743 | 7.8%
4 | 29221 | 7.6%
2 | 19894 | 5.2%
33 | 12159 | 3.2%
20 | 10993 | 2.9%
34 | 10663 | 2.8%
26 | 10268 | 2.7%
38 | 10216 | 2.7%
39 | 9551 | 2.5%
Other values (41) | 200838 | 52.4%

Value | Count | Frequency (%)
1 | 29743 | 7.8%
2 | 19894 | 5.2%
3 | 33078 | 8.6%
4 | 29221 | 7.6%
5 | 8366 | 2.2%

Value | Count | Frequency (%)
51 | 3418 | 0.9%
50 | 3762 | 1.0%
49 | 4858 | 1.3%
48 | 4705 | 1.2%
47 | 5112 | 1.3%

Census Tract
Real number (ℝ≥0)

MISSING

Distinct count: 1190
Unique (%): 0.3%
Missing: 6621
Missing (%): 1.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 28842.442327095458
Minimum: 100.0
Maximum: 162100.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 100
5-th percentile: 1800
Q1: 8000
Median: 16300
Q3: 40500
95-th percentile: 93000
Maximum: 162100
Range: 162000
Interquartile range (IQR): 32500

Descriptive statistics

Standard deviation: 30501.43401
Coefficient of variation (CV): 1.057519113
Kurtosis: 2.813472926
Mean: 28842.44233
Median Absolute Deviation (MAD): 11000
Skewness: 1.736079317
Sum: 1.0862756e+10
Variance: 930337476.8
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
87100 | 3249 | 0.8%
6500 | 3168 | 0.8%
4100 | 3015 | 0.8%
3800 | 2991 | 0.8%
10400 | 2657 | 0.7%
2100 | 2598 | 0.7%
6300 | 2554 | 0.7%
13700 | 2422 | 0.6%
12100 | 2393 | 0.6%
9200 | 2390 | 0.6%
Other values (1180) | 349187 | 91.1%
(Missing) | 6621 | 1.7%

Value | Count | Frequency (%)
100 | 495 | 0.1%
200 | 387 | 0.1%
201 | 37 | < 0.1%
202 | 149 | < 0.1%
300 | 683 | 0.2%

Value | Count | Frequency (%)
162100 | 29 | < 0.1%
161700 | 153 | < 0.1%
157903 | 41 | < 0.1%
157902 | 125 | < 0.1%
157901 | 93 | < 0.1%

BIN
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct count: 20399
Unique (%): 5.4%
Missing: 8384
Missing (%): 2.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 2513323.3370049163
Minimum: 1000000.0
Maximum: 5799501.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 1000000
5-th percentile: 1005449
Q1: 1044044
Median: 3008393
Q3: 4002472
95-th percentile: 4447721
Maximum: 5799501
Range: 4799501
Interquartile range (IQR): 2958428

Descriptive statistics

Standard deviation: 1346335.526
Coefficient of variation (CV): 0.5356793956
Kurtosis: -1.449867594
Mean: 2513323.337
Median Absolute Deviation (MAD): 1184997
Skewness: 0.1630198287
Sum: 9.421468994e+11
Variance: 1.812619349e+12
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
4000000 | 1763 | 0.5%
1000000 | 557 | 0.1%
1035381 | 489 | 0.1%
4113546 | 478 | 0.1%
4045999 | 438 | 0.1%
1012541 | 411 | 0.1%
4112276 | 399 | 0.1%
3000000 | 390 | 0.1%
5039658 | 352 | 0.1%
2000000 | 321 | 0.1%
Other values (20389) | 369263 | 96.4%
(Missing) | 8384 | 2.2%

Value | Count | Frequency (%)
1000000 | 557 | 0.1%
1000003 | 11 | < 0.1%
1000005 | 116 | < 0.1%
1000006 | 13 | < 0.1%
1000007 | 2 | < 0.1%

Value | Count | Frequency (%)
5799501 | 111 | < 0.1%
5170408 | 1 | < 0.1%
5170220 | 1 | < 0.1%
5169029 | 7 | < 0.1%
5166713 | 4 | < 0.1%

BBL
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 20044
Unique (%): 5.2%
Missing: 1040
Missing (%): 0.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 2403734955.1260633
Minimum: 1.0
Maximum: 5270000501.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1001980047
Q1: 1010450001
Median: 3001867501
Q3: 4001650006
95-th percentile: 4115160200
Maximum: 5270000501
Range: 5270000500
Interquartile range (IQR): 2991200005

Descriptive statistics

Standard deviation: 1338922432
Coefficient of variation (CV): 0.5570174986
Kurtosis: -1.358584989
Mean: 2403734955
Median Absolute Deviation (MAD): 1065042536
Skewness: 0.1382735616
Sum: 9.187195185e+14
Variance: 1.792713279e+18
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
1 | 3464 | 0.9%
4 | 1726 | 0.5%
3 | 1122 | 0.3%
2 | 775 | 0.2%
4142600001 | 660 | 0.2%
1012800001 | 489 | 0.1%
4050190005 | 478 | 0.1%
4018600100 | 438 | 0.1%
1007130001 | 411 | 0.1%
4049730016 | 399 | 0.1%
Other values (20034) | 372243 | 97.1%
(Missing) | 1040 | 0.3%

Value | Count | Frequency (%)
1 | 3464 | 0.9%
2 | 775 | 0.2%
3 | 1122 | 0.3%
4 | 1726 | 0.5%
5 | 257 | 0.1%

Value | Count | Frequency (%)
5270000501 | 111 | < 0.1%
5080470021 | 1 | < 0.1%
5080470016 | 7 | < 0.1%
5080460001 | 27 | < 0.1%
5080430019 | 1 | < 0.1%

NTA
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 193
Unique (%): 0.1%
Missing: 6621
Missing (%): 1.7%
Memory size: 2.9 MiB

Value | Count | Frequency (%)
MN17 | 22776 | 5.9%
MN23 | 11458 | 3.0%
MN13 | 10569 | 2.8%
MN24 | 10491 | 2.7%
MN27 | 9317 | 2.4%
MN22 | 8726 | 2.3%
MN19 | 8583 | 2.2%
MN15 | 8022 | 2.1%
QN22 | 7640 | 2.0%
MN25 | 6912 | 1.8%
Other values (183) | 272130 | 71.0%
(Missing) | 6621 | 1.7%

Length

Max length: 4
Median length: 4
Mean length: 3.982723845
Min length: 3

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r; only the direction of the slope determines the sign.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
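
The calculation described above can be sketched in plain Python as follows. This is a minimal illustration, not the code pandas-profiling runs internally; the function name `pearson_r` is chosen here for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r: covariance of X and Y divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population covariance and standard deviations; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / (std_x * std_y)


x = [1, 2, 3, 4]
r_up = pearson_r(x, [2, 4, 6, 8])            # perfectly linear, increasing
r_down = pearson_r(x, [8, 6, 4, 2])          # perfectly linear, decreasing
r_shifted = pearson_r(x, [23, 43, 63, 83])   # y rescaled and shifted: r is unchanged
```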

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
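
A minimal sketch of this rank-based calculation in plain Python is shown below. Giving tied values the average of their rank positions is a common convention and an assumption here, since the text above does not say how ties are handled. The function names are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry)) / n
    sx = sqrt(sum((a - mx) ** 2 for a in rx) / n)
    sy = sqrt(sum((b - my) ** 2 for b in ry) / n)
    return cov / (sx * sy)


# A nonlinear but strictly increasing relationship still gives rho = 1.
rho_cubic = spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125])
tied_ranks = average_ranks([10, 20, 20, 30])
```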

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
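
The pair-counting definition above can be sketched directly. The version below is the simple form sometimes called tau-a, where tied pairs count as neither concordant nor discordant. Library implementations often use a tie-adjusted variant instead, so results can differ when the data contain ties. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    pairs = list(combinations(zip(xs, ys), 2))
    if not pairs:
        raise ValueError("need at least 2 observations")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in pairs:
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y: counted in neither group
    return (concordant - discordant) / len(pairs)


tau_same = kendall_tau_a([1, 2, 3], [1, 2, 3])
tau_reversed = kendall_tau_a([1, 2, 3], [3, 2, 1])
tau_mixed = kendall_tau_a([1, 2, 3], [1, 3, 2])  # 2 concordant, 1 discordant, 3 pairs
```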

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available in the Phik project documentation.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
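
For illustration, the sketch below computes a bias-corrected Cramér's V in plain Python, following the correction commonly attributed to Bergsma (2013): the chi-squared based φ² is reduced by (k-1)(r-1)/(n-1), and the numbers of rows r and columns k are shrunk in the same spirit. This is an assumed reading of that correction, not a copy of the code pandas-profiling uses; the function name is hypothetical.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long sequences of category labels."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full r x k contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # a variable with a single category carries no association
    return sqrt(phi2_corr / denom)


# Perfect association: each x category maps to exactly one y category.
v_perfect = cramers_v_corrected(["a"] * 50 + ["b"] * 50, ["x"] * 50 + ["y"] * 50)
# Independence: every combination occurs equally often.
v_independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25)
```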

Missing values

Sample

First rows

(index) | CAMIS | DBA | BORO | BUILDING | STREET | ZIPCODE | PHONE | CUISINE DESCRIPTION | INSPECTION DATE | ACTION | VIOLATION CODE | VIOLATION DESCRIPTION | CRITICAL FLAG | SCORE | GRADE | GRADE DATE | RECORD DATE | INSPECTION TYPE | Latitude | Longitude | Community Board | Council District | Census Tract | BIN | BBL | NTA
0 | 50002262 | SOUTHEAST BAKERY | Brooklyn | 6821 | FORT HAMILTON PARKWAY | 11219.0 | 7186801188 | Chinese | 12/12/2019 | Violations were cited in the following area(s). | 06A | Personal cleanliness inadequate. Outer garment soiled with possible contaminant. Effective hair restraint not worn in an area where food is prepared. | Critical | 28.0 | NaN | NaN | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.629009 | -74.011629 | 310.0 | 43.0 | 21000.0 | 3143040.0 | 3.057710e+09 | BK30
1 | 41031137 | STARBUCKS | Manhattan | 1500 | BROADWAY | 10036.0 | 2122217515 | Coffee/Tea | 07/27/2021 | Violations were cited in the following area(s). | 04N | Filth flies or food/refuse/sewage-associated (FRSA) flies present in facility’s food and/or non-food areas. Filth flies include house flies, little house flies, blow flies, bottle flies and flesh flies. Food/refuse/sewage-associated flies include fruit flies, drain flies and Phorid flies. | Critical | 24.0 | NaN | NaN | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.756849 | -73.985973 | 105.0 | 4.0 | 11900.0 | 1022610.0 | 1.009960e+09 | MN17
2 | 41471964 | DRUNKEN HORSE | Manhattan | 225 | 10 AVENUE | 10011.0 | 2126040505 | American | 12/10/2019 | Violations were cited in the following area(s). | 06C | Food not protected from potential source of contamination during storage, preparation, transportation, display or service. | Critical | 12.0 | A | 12/10/2019 | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.747910 | -74.004013 | 104.0 | 3.0 | 9900.0 | 1012354.0 | 1.006950e+09 | MN13
3 | 50053120 | DELICIAS MEXICANOS | Queens | 10214 | ROOSEVELT AVE | 11368.0 | 7186724485 | Mexican | 02/26/2019 | Violations were cited in the following area(s). | 02B | Hot food item not held at or above 140º F. | Critical | 30.0 | NaN | NaN | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.749743 | -73.863819 | 404.0 | 21.0 | 40500.0 | 4437314.0 | 4.019740e+09 | QN26
4 | 41640846 | RANDOLPH BEER | Manhattan | 343 | BROOME STREET | 10013.0 | 2123343706 | American | 10/16/2017 | Violations were cited in the following area(s). | 09C | Food contact surface not properly maintained. | Not Critical | 33.0 | NaN | NaN | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.719650 | -73.994769 | 102.0 | 1.0 | 4100.0 | 1006944.0 | 1.004700e+09 | MN24
5 | 41630632 | J J NOODLE | Manhattan | 19 | HENRY STREET | 10002.0 | 2125712440 | Chinese | 04/30/2018 | Violations were cited in the following area(s). | 08A | Facility not vermin proof. Harborage or conditions conducive to attracting vermin to the premises and/or allowing vermin to exist. | Not Critical | 33.0 | NaN | NaN | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.712854 | -73.996487 | 103.0 | 1.0 | 800.0 | 1003431.0 | 1.002800e+09 | MN27
6 | 50054342 | FINE BAKERY CITY INC | Manhattan | 303 | GRAND STREET | 10002.0 | 2129663318 | Bakery Products/Desserts | 08/21/2018 | Violations were cited in the following area(s). | 10B | Plumbing not properly installed or maintained; anti-siphonage or backflow prevention device not provided where required; equipment or floor not properly drained; sewage disposal system in disrepair or not functioning properly. | Not Critical | 26.0 | NaN | NaN | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.717506 | -73.991771 | 103.0 | 1.0 | 1600.0 | 1003983.0 | 1.003070e+09 | MN27
7 | 50052591 | KAKUREGA JAPANESE CUISINE | Queens | 13344 | 37TH AVE | 11354.0 | 7188868668 | Japanese | 01/23/2019 | Violations were cited in the following area(s). | 10F | Non-food contact surface improperly constructed. Unacceptable material used. Non-food contact surface or equipment improperly maintained and/or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit. | Not Critical | 23.0 | NaN | NaN | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.760679 | -73.833428 | 407.0 | 20.0 | 87100.0 | 4532162.0 | 4.049720e+09 | QN22
8 | 50037148 | DUNKIN | Staten Island | 585 | VETERANS ROAD WEST | 10309.0 | 7186733298 | Donuts | 09/08/2021 | Violations were cited in the following area(s). | 10F | Non-food contact surface improperly constructed. Unacceptable material used. Non-food contact surface or equipment improperly maintained and/or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit. | Not Critical | 7.0 | A | 09/08/2021 | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.548477 | -74.221303 | 503.0 | 51.0 | 22600.0 | 5159203.0 | 5.071030e+09 | SI11
9 | 40921141 | ALBERTO'S | Queens | 9831 | METROPOLITAN AVE | 11375.0 | 7182687860 | Italian | 02/01/2019 | Violations were cited in the following area(s). | 15F6 | Workplace SFAA policy not prominently posted in workplace | Not Critical | NaN | NaN | NaN | 11/07/2021 | Smoke-Free Air Act / Initial Inspection | 40.710905 | -73.853662 | 406.0 | 29.0 | 72900.0 | 4076686.0 | 4.032070e+09 | QN17
940921141ALBERTO'SQueens9831METROPOLITAN AVE11375.07182687860Italian02/01/2019Violations were cited in the following area(s).15F6Workplace SFAA policy not prominently posted in workplaceNot CriticalNaNNaNNaN11/07/2021Smoke-Free Air Act / Initial Inspection40.710905-73.853662406.029.072900.04076686.04.032070e+09QN17

Last rows

(index) | CAMIS | DBA | BORO | BUILDING | STREET | ZIPCODE | PHONE | CUISINE DESCRIPTION | INSPECTION DATE | ACTION | VIOLATION CODE | VIOLATION DESCRIPTION | CRITICAL FLAG | SCORE | GRADE | GRADE DATE | RECORD DATE | INSPECTION TYPE | Latitude | Longitude | Community Board | Council District | Census Tract | BIN | BBL | NTA
383235 | 40931929 | FRANKIE & JOHNNIE'S STEAKHOUSE | Manhattan | 32 | WEST 37 STREET | 10018.0 | 2129478940 | Steakhouse | 03/16/2017 | Violations were cited in the following area(s). | 04M | Live roaches present in facility's food and/or non-food areas. | Critical | 12.0 | A | 03/16/2017 | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.750830 | -73.984368 | 105.0 | 4.0 | 8400.0 | 1015948.0 | 1.008380e+09 | MN17
383236 | 50065362 | THE GREAT AMERICAN BAGEL BAKERY | Manhattan | 200 | BROADWAY | 10038.0 | 6465967105 | American | 05/08/2018 | Violations were cited in the following area(s). | 04A | Food Protection Certificate not held by supervisor of food operations. | Critical | 32.0 | C | 05/08/2018 | 11/07/2021 | Pre-permit (Operational) / Re-inspection | 40.710526 | -74.009353 | 101.0 | 1.0 | 1502.0 | 1089384.0 | 1.000790e+09 | MN25
383237 | 50042450 | EGGER'S ICE CREAM PARLOR | Staten Island | 1194 | FOREST AVENUE | 10310.0 | 7189816534 | Frozen Desserts | 04/09/2018 | Violations were cited in the following area(s). | 08A | Facility not vermin proof. Harborage or conditions conducive to attracting vermin to the premises and/or allowing vermin to exist. | Not Critical | 21.0 | NaN | NaN | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.626447 | -74.129757 | 501.0 | 49.0 | 15100.0 | 5009718.0 | 5.003540e+09 | SI07
383238 | 50037651 | SABOR TROPICAL | Manhattan | 143 | SHERMAN AVENUE | 10034.0 | 2123045144 | Latin American | 07/13/2016 | Violations were cited in the following area(s). | 04N | Filth flies or food/refuse/sewage-associated (FRSA) flies present in facility’s food and/or non-food areas. Filth flies include house flies, little house flies, blow flies, bottle flies and flesh flies. Food/refuse/sewage-associated flies include fruit flies, drain flies and Phorid flies. | Critical | 24.0 | NaN | NaN | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.864534 | -73.923368 | 112.0 | 10.0 | 29100.0 | 1064764.0 | 1.022210e+09 | MN01
383239 | 50010628 | GREENPOINT FISH & LOBSTER CO. | Brooklyn | 114 | NASSAU AVENUE | 11222.0 | 7183490400 | Seafood | 07/24/2018 | Violations were cited in the following area(s). | 06A | Personal cleanliness inadequate. Outer garment soiled with possible contaminant. Effective hair restraint not worn in an area where food is prepared. | Critical | 7.0 | A | 07/24/2018 | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.724280 | -73.949202 | 301.0 | 33.0 | 57100.0 | 3066776.0 | 3.026810e+09 | BK76
383240 | 50068847 | 52ND SUSHI | Queens | 5221 | ROOSEVELT AVE | 11377.0 | 7185079204 | Japanese | 05/25/2019 | Violations were cited in the following area(s). | 10B | Plumbing not properly installed or maintained; anti-siphonage or backflow prevention device not provided where required; equipment or floor not properly drained; sewage disposal system in disrepair or not functioning properly. | Not Critical | 13.0 | A | 05/25/2019 | 11/07/2021 | Cycle Inspection / Re-inspection | 40.744256 | -73.912196 | 402.0 | 26.0 | 25302.0 | 4030800.0 | 4.013150e+09 | QN31
383241 | 50066349 | SWEET CHICK | Brooklyn | 636 | CARLTON AVENUE | 11238.0 | 7184847724 | American | 05/21/2019 | Violations were cited in the following area(s). | 02G | Cold food item held above 41º F (smoked fish and reduced oxygen packaged foods above 38 ºF) except during necessary preparation. | Critical | 12.0 | A | 05/21/2019 | 11/07/2021 | Cycle Inspection / Re-inspection | 40.677575 | -73.972312 | 308.0 | 35.0 | 16100.0 | 3028687.0 | 3.011570e+09 | BK64
383242 | 50011214 | THE CRABBY SHACK | Brooklyn | 613 | FRANKLIN AVENUE | 11238.0 | 7184841570 | Seafood | 05/16/2019 | Violations were cited in the following area(s). | 08A | Facility not vermin proof. Harborage or conditions conducive to attracting vermin to the premises and/or allowing vermin to exist. | Not Critical | 11.0 | NaN | NaN | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.677400 | -73.955526 | 308.0 | 35.0 | 22100.0 | 3030246.0 | 3.012110e+09 | BK61
383243 | 50011892 | ROBUSTA ESPRESSO BAR | Manhattan | 50 | WEST 47 STREET | 10036.0 | 2123020242 | Coffee/Tea | 09/01/2021 | Violations were cited in the following area(s). | 08A | Facility not vermin proof. Harborage or conditions conducive to attracting vermin to the premises and/or allowing vermin to exist. | Not Critical | 34.0 | NaN | NaN | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.757326 | -73.980187 | 105.0 | 4.0 | 9600.0 | 1034346.0 | 1.012628e+09 | MN17
383244 | 50013768 | SUBWAY CAFE | Manhattan | 351 | WEST 42 STREET | 10036.0 | 2129918945 | Sandwiches | 12/06/2017 | Violations were cited in the following area(s). | 02G | Cold food item held above 41º F (smoked fish and reduced oxygen packaged foods above 38 ºF) except during necessary preparation. | Critical | 17.0 | NaN | NaN | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.757841 | -73.991229 | 104.0 | 3.0 | 12100.0 | 1024937.0 | 1.010330e+09 | MN15

Duplicate rows

Most frequent

(index) | CAMIS | DBA | BORO | BUILDING | STREET | ZIPCODE | PHONE | CUISINE DESCRIPTION | INSPECTION DATE | ACTION | VIOLATION CODE | VIOLATION DESCRIPTION | CRITICAL FLAG | SCORE | GRADE | GRADE DATE | RECORD DATE | INSPECTION TYPE | Latitude | Longitude | Community Board | Council District | Census Tract | BIN | BBL | NTA | count
4 | 40356483 | WILKEN'S FINE FOOD | Brooklyn | 7114 | AVENUE U | 11234.0 | 7184443838 | Sandwiches | 05/03/2019 | Violations were cited in the following area(s). | 10F | Non-food contact surface improperly constructed. Unacceptable material used. Non-food contact surface or equipment improperly maintained and/or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit. | Not Critical | 13.0 | A | 05/03/2019 | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.620112 | -73.906989 | 318.0 | 46.0 | 70000.0 | 3237693.0 | 3.084310e+09 | BK45 | 3
6 | 40359705 | NATHAN'S FAMOUS | Brooklyn | 1310 | SURF AVENUE | 11224.0 | 7183332202 | Hotdogs | 03/07/2018 | Violations were cited in the following area(s). | 10F | Non-food contact surface improperly constructed. Unacceptable material used. Non-food contact surface or equipment improperly maintained and/or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit. | Not Critical | 10.0 | A | 03/07/2018 | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.575537 | -73.981652 | 313.0 | 47.0 | 35200.0 | 3189660.0 | 3.070740e+09 | BK21 | 3
20 | 40363685 | BERKELEY | Manhattan | 437 | MADISON AVENUE | 10022.0 | 2128328121 | American | 07/25/2017 | Violations were cited in the following area(s). | 10F | Non-food contact surface improperly constructed. Unacceptable material used. Non-food contact surface or equipment improperly maintained and/or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit. | Not Critical | 9.0 | A | 07/25/2017 | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.757512 | -73.975827 | 105.0 | 4.0 | 10200.0 | 1035455.0 | 1.012850e+09 | MN17 | 3
29 | 40364518 | 7B BAR | Manhattan | 108 | AVENUE B | 10009.0 | 2126776742 | American | 06/05/2018 | Violations were cited in the following area(s). | 10F | Non-food contact surface improperly constructed. Unacceptable material used. Non-food contact surface or equipment improperly maintained and/or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit. | Not Critical | 10.0 | A | 06/05/2018 | 11/07/2021 | Cycle Inspection / Re-inspection | 40.724985 | -73.981283 | 103.0 | 2.0 | 3200.0 | 1005094.0 | 1.004020e+09 | MN22 | 3
32 | 40364581 | JUNIOR'S | Brooklyn | 386 | FLATBUSH AVENUE EXTENSION | 11201.0 | 7188525257 | American | 01/24/2019 | Violations were cited in the following area(s). | 10F | Non-food contact surface improperly constructed. Unacceptable material used. Non-food contact surface or equipment improperly maintained and/or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit. | Not Critical | 12.0 | A | 01/24/2019 | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.690200 | -73.981624 | 302.0 | 33.0 | 1500.0 | 3000369.0 | 3.001490e+09 | BK38 | 3
38 | 40365239 | DORRIAN'S RED HAND RESTAURANT | Manhattan | 1616 | 2 AVENUE | 10028.0 | 2127726660 | Irish | 11/08/2018 | Violations were cited in the following area(s). | 10F | Non-food contact surface improperly constructed. Unacceptable material used. Non-food contact surface or equipment improperly maintained and/or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit. | Not Critical | 11.0 | A | 11/08/2018 | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.776405 | -73.952802 | 108.0 | 5.0 | 13800.0 | 1049947.0 | 1.015460e+09 | MN32 | 3
62 | 40366230 | DON PEPPE | Queens | 13558 | LEFFERTS BOULEVARD | 11420.0 | 7188457587 | Italian | 09/13/2018 | Violations were cited in the following area(s). | 10F | Non-food contact surface improperly constructed. Unacceptable material used. Non-food contact surface or equipment improperly maintained and/or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit. | Not Critical | 13.0 | A | 09/13/2018 | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.670258 | -73.820976 | 410.0 | 32.0 | 83800.0 | 4256605.0 | 4.118050e+09 | QN55 | 3
71 | 40366473 | TIO PEPE | Manhattan | 168 | WEST 4 STREET | 10014.0 | 2122429338 | Seafood | 06/19/2019 | Violations were cited in the following area(s). | 10F | Non-food contact surface improperly constructed. Unacceptable material used. Non-food contact surface or equipment improperly maintained and/or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit. | Not Critical | 7.0 | A | 06/19/2019 | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.732057 | -74.001443 | 102.0 | 3.0 | 6700.0 | 1010145.0 | 1.005900e+09 | MN23 | 3
100 | 40367841 | CAFE FIORELLO | Manhattan | 1900 | BROADWAY | 10023.0 | 2125955330 | Italian | 03/10/2020 | Violations were cited in the following area(s). | 10F | Non-food contact surface improperly constructed. Unacceptable material used. Non-food contact surface or equipment improperly maintained and/or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit. | Not Critical | 13.0 | A | 03/10/2020 | 11/07/2021 | Cycle Inspection / Re-inspection | 40.771511 | -73.982150 | 107.0 | 6.0 | 14900.0 | 1027472.0 | 1.011168e+09 | MN14 | 3
115 | 40368577 | DENNY'S PUB | Brooklyn | 106 | BEVERLY ROAD | 11218.0 | 7184352156 | American | 07/26/2018 | Violations were cited in the following area(s). | 10F | Non-food contact surface improperly constructed. Unacceptable material used. Non-food contact surface or equipment improperly maintained and/or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit. | Not Critical | 12.0 | A | 07/26/2018 | 11/07/2021 | Cycle Inspection / Initial Inspection | 40.642932 | -73.978870 | 312.0 | 39.0 | 48600.0 | 3125226.0 | 3.053530e+09 | BK41 | 3
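
A "most frequent duplicate rows" table like the one above can be approximated directly in pandas. The sketch below is an illustration only and is not taken from the profiling tool's source: the helper name most_frequent_duplicates and the toy frame are hypothetical, and only the CAMIS and DBA column names come from the report.

```python
import pandas as pd


def most_frequent_duplicates(df: pd.DataFrame, n: int = 10) -> pd.DataFrame:
    """Return the n most repeated fully identical rows, with a 'count' column.

    Hypothetical helper for illustration; not part of pandas or pandas-profiling.
    """
    # Keep every row that has at least one exact copy elsewhere in the frame.
    dupes = df[df.duplicated(keep=False)]
    # Group on all columns and count the copies of each distinct row.
    # By default groupby drops groups whose key contains NaN; pass
    # dropna=False (pandas >= 1.1) to groupby if those rows should be kept.
    counts = dupes.groupby(list(df.columns)).size().reset_index(name="count")
    # Largest counts first; ties keep their original (sorted) order.
    return counts.nlargest(n, "count")


# Toy data standing in for the inspection records (values are made up).
toy = pd.DataFrame(
    {
        "CAMIS": [1, 1, 1, 2, 2, 3],
        "DBA": ["A", "A", "A", "B", "B", "C"],
    }
)
result = most_frequent_duplicates(toy)
print(result)
```

On the real dataset the same call would take the full 26-column frame; whether its output matches the report row for row depends on how the profiling version in use treats rows containing NaN.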